Abstract

The paper presents a knowledge representation formalism, in the form 
of a high-level Action Description Language (ADL) for multi-agent sys- 
tems, where autonomous agents reason and act in a shared environment. 
Agents are autonomously pursuing individual goals, but are capable of 
interacting through a shared knowledge repository. In their interactions 
through shared portions of the world, the agents deal with problems of 
synchronization and concurrency; the action language allows the descrip- 
tion of strategies to ensure a consistent global execution of the agents' au- 
tonomously derived plans. A distributed planning problem is formalized 
by providing the declarative specifications of the portion of the problem 
pertaining to a single agent. Each of these specifications is executable by
a stand-alone CLP-based planner. The coordination among agents ex- 
ploits a Linda infrastructure. The proposal is validated in a prototype 
implementation developed in SICStus Prolog. 

To appear in Theory and Practice of Logic Programming (TPLP).
1 Introduction



Representing and reasoning in multi-agent domains are two of the most active 
research areas in multi-agent system (MAS) research. The literature in this area
is extensive, and it provides a plethora of logics for representing and reasoning 
about various aspects of MAS domains, e.g., [201 HIM 1121 [12] ■ 

A large number of the logics proposed in the literature have been designed 
to specifically focus on particular aspects of the problem of modeling MAS, 
often justified by a specific application scenario. This makes them suitable to 
address specific subsets of the general features required to model real-world 
MAS domains. The task of generalizing some of these existing proposals to 
create a uniform and comprehensive framework for modeling several different 
aspects of MAS domains is an open problem. Although we do not dispute the 
possibility of extending several of these existing proposals in various directions, 
the task does not seem easy. Similarly, a variety of multi-agent programming 
platforms have been proposed, mostly in the style of multi-agent programming 
languages, like Jason [3], ConGolog [9], 3APL [7], GOAL, but with limited
planning capabilities. 

Our effort in this paper is focused on the development of a novel action 
language for multi-agent systems. The foundations of this effort can be found 
in the action language $\mathcal{B}^{MV}$ [11]; this is a flexible single-agent action language,
which generalizes the action language $\mathcal{B}$ [13] with support for multi-valued fluents,
non-Markovian domains, and constraint-based formulations, enabling, for
example, the formulation of costs and preferences. $\mathcal{B}^{MV}$ has been implemented
in CLP(FD).

In this work, we extend $\mathcal{B}^{MV}$ to support MAS domains. The perspective
is that of a distributed environment, with agents pursuing individual goals but
capable of interacting through shared knowledge and through collaborative actions.
A first step in this direction has been described in the $\mathcal{B}^{MAP}$ language [10],
a multi-agent action language with capabilities for centralized planning. In
this paper, we expand on this by moving $\mathcal{B}^{MAP}$ towards a truly distributed
multi-agent platform. The language is extended with Communication primitives
for modeling interactions among Autonomous Agents. We refer to this
language simply as $\mathcal{B}^{AAC}$. Differently from $\mathcal{B}^{MAP}$, agents in the framework
proposed in this paper have private goals and are capable of developing independent
plans. Agents' plans are composed in a distributed fashion, leading to replanning
and/or introduction of communication activities to enable a consistent
global execution.



The design of $\mathcal{B}^{AAC}$ is validated in a prototype, available from
http://www.dimi.uniud.it/dovier/BAAC, that uses CLP(FD) for the development of the
individual plans of each agent and Linda for the coordination and interaction
among them.
2 Syntax of the Multiagent Language $\mathcal{B}^{AAC}$

The signature of $\mathcal{B}^{AAC}$ consists of:

1. A set $\mathcal{G}$ of agent names, used to identify the agents in the system;

2. A set $\mathcal{A}$ of action names;

3. A set $\mathcal{F}$ of fluent names, i.e., predicates describing properties of objects in
the world, and providing description of states of the world; such properties
might be affected by the execution of actions; and

4. A set $\mathcal{V}$ of values for the fluents in $\mathcal{F}$; we assume $\mathcal{V} = \mathbb{Z}$.

The behavior of each agent $a$ is specified by an action description theory $\mathcal{D}_a$,
composed of axioms of the forms described next.

Considering the action theory $\mathcal{D}_a$ of an agent $a$, name and priority of the
agent are specified by agent declarations:

agent a [ priority n ] (1)

where $n \in \mathbb{N}$. We adopt the convention that 0 denotes the highest priority,
which is also the default priority in the absence of a declaration. As we will
see, priorities can be used to resolve possible conflicts among actions of different
agents.

It is possible to specify which agents are known to the agent a, as follows: 

known_agents $a_1, a_2, \dots, a_k$ (2)

Agent $a$ can explicitly communicate with any of the agents $a_i$, as discussed
below.

We assume the existence of a "global" set $\mathcal{F}$ of fluents, and any agent $a$
knows and can access only those fluents that are declared in $\mathcal{D}_a$ by axioms of
the form:

fluent $f_1, \dots, f_h$ valued $dom_i$ (3)

with $\{f_1, \dots, f_h\} \subseteq \mathcal{F}$, $h \geq 1$, and $dom_i \subseteq \mathcal{V}$ a set of values representing
the admissible values for each $f_i$ (possibly represented as an interval $[v_1, v_2]$).
These fluents describe the "local state" of the agent. We assume that the fluents
accessed by multiple agents are defined consistently in each agent's local theory.

Example 1 Let us specify a domain inspired by volleyball. There are two teams:
black and white, with one player in each team; let us focus on the domain
for the white team (Sect. 3.8 deals with the case that involves more players).
We introduce fluents to model the positions of the players and of the ball, the
possession of the ball, the score, and a numerical fluent defense_time. All
players know the positions of all players. Since the teams are separated by the
net, the x-coordinates of black and white players must differ. This can be
stated by:
agent player(white,X) :- num(X).
known_agents player(black,X) :- num(X).
fluent x(player(white,X)) valued [B,E] :- num(X), net(NET), B is NET+1, linex(E).
fluent x(player(black,X)) valued [1,E] :- num(X), net(NET), E is NET-1.
fluent y(A) valued [1,MY] :- player(A), liney(MY).
fluent x(ball) valued [1,MX] :- linex(MX).
fluent y(ball) valued [1,MY] :- liney(MY).
fluent hasball(A) valued [0,1] :- agent(A).
fluent point(T) valued [0,1] :- team(T).
fluent defense_time valued [0,1].
team(black). team(white). num(1). linex(11). net(6). liney(5).

where linex/liney are the field sizes, and net is the x-coordinate of the net. □

Fluents are used in Fluent Expressions (FE), which are defined as follows: 

$FE ::= n \mid f^t \mid FE_1 \oplus FE_2 \mid -(FE) \mid \mathrm{abs}(FE) \mid \mathrm{rei}(C)$ (4)

where $n \in \mathcal{V}$, $f \in \mathcal{F}$, $t \in \{0, -1, -2, -3, \dots\}$, $\oplus \in \{+, -, *, /, \mathrm{mod}\}$, and $r \in \mathbb{N}$.
FE is referred to as a timeless expression if it contains no occurrences of $f^t$
with $t \neq 0$. $f$ can be used as a shorthand of $f^0$. The notation $f^t$ is an
annotated fluent expression. The expression refers to a relative time reference,
indicating the value $f$ had $-t$ steps in the past. The last alternative
in (4), a reified expression, requires the notion of constraint $C$, introduced below.
$\mathrm{rei}(C)$ represents a Boolean value that reflects the truth value of $C$. A
Primitive Constraint (PC) is a formula $FE_1 \; op \; FE_2$, where $FE_1$ and $FE_2$ are fluent
expressions, and $op \in \{=, \neq, \geq, \leq, >, <\}$. A constraint $C$ is a propositional
combination of PCs. We will refer to a primitive constraint of the form
$f = FE$, where $f \in \mathcal{F}$, as a basic primitive constraint. We accept the constraint
$\mathrm{pair}(FE_1, FE_3) = \mathrm{pair}(FE_2, FE_4)$ as syntactic sugar of $FE_1 = FE_2$ and
$FE_3 = FE_4$.

An axiom of the form action x in $\mathcal{D}_a$ declares that the action $x \in \mathcal{A}$ is
executable by the agent a. Observe that the same action name x can be used for 
different actions executable by different agents. This does not cause ambiguity, 
since each agent's knowledge is described by its own action theory. A special 
action, nop, is executable by every agent, and it does not change any of the 
fluents. 

Example 2 The actions for each player A of Example 1 are:

• A : move(d) one step in direction d, where d is one of the eight directions:
north, north-east, east, ..., west, north-west.

• A : throw(d, f) the ball in direction d (same eight directions as above) with
strength f varying from 1 to a maximum throw power (5 in our example).

Moreover, the player of each team is in charge of checking if a point has been
scored (in such case, he whistles). We write the actions as act([A], action_name)
and state these axioms:

action act([A],move(D)) :- whiteplayer(A), direction(D).
action act([A],throw(D,F)) :- whiteplayer(A), direction(D), power(F).
action act([player(white,1)],whistle).
where whiteplayer, power, and direction can be defined as follows:

whiteplayer(player(white,N)) :- agent(player(white,N)).
power(1). power(2). power(3). power(4). power(5).
direction(D) :- delta(D,_,_). delta(nw,-1,1). delta(n,0,1). delta(ne,1,1).
delta(w,-1,0). delta(e,1,0). delta(sw,-1,-1). delta(s,0,-1). delta(se,1,-1).

The executability of the actions is described by axioms of the form: 

executable x if C (5) 

where $x \in \mathcal{A}$ and $C$ is a constraint. The axiom states that $x$ is executable
only if C is entailed by the state of the world. We assume that at least one 
executability axiom is present for each action; multiple executability axioms are 
treated disjunctively. 

Example 3 In our working example, we can state executability as follows: 

executable act([player(white,1)],whistle) if [S eq 0] :- build_sum(S).
executable act([A],move(D)) if [hasball(A) eq 0, defense_time gt 0,
    Net lt x(A)+DX, x(A)+DX leq MX, 1 leq y(A)+DY, y(A)+DY leq MY] :-
    action(act([A],move(D))), delta(D,DX,DY),
    net(Net), linex(MX), liney(MY).
executable act([A],throw(D,F)) if
    [hasball(A) gt 0, defense_time eq 0, 1 leq x(A)+DX*F, x(A)+DX*F leq MX,
     1 leq y(A)+DY*F, y(A)+DY*F leq MY] :-
    action(act([A],throw(D,F))), delta(D,DX,DY), linex(MX), liney(MY).

These axioms state that neither a player nor the ball can leave the field. build_sum
is recursively defined to return the expression: defense_time + hasball($A_1$) +
... + hasball($A_n$), where $A_1, \dots, A_n$ are the players (i.e., player(white,1) and
player(black,1)). The operators $=, \neq, \leq, <$, etc. are concretely represented
by eq, neq, leq, lt, respectively. □

The effects of an action execution are modeled by dynamic causal laws: 

x causes Eff if Prec (6)

where $x \in \mathcal{A}$, Prec is a constraint, and Eff is a conjunction of basic primitive
constraints. The axiom asserts that if Prec is true with respect to the current 
state, then Eff must hold after the execution of x. 

Since agents share fluents, their actions may interfere and cause inconsis- 
tencies. A conflict happens when the effects of different concurrent actions 
are incompatible and would lead to an inconsistent state; note that we allow 
only consistent states to exist during the evolution of the world. A procedure 
has to be applied to resolve a conflict and determine a consistent subset of the
conflicting actions (see Sect. 3.3).



Example 4 Let us describe the effects of the actions in the volleyball domain.
When the ball is thrown with force f in direction d, it reaches a destination cell
whose distance is as follows: a) if d is either north or south then $\Delta X = 0$, $\Delta Y = f$;
b) if d is east or west then $\Delta X = f$, $\Delta Y = 0$; c) if d is any other direction,
$\Delta X = f$, $\Delta Y = f$. An additional effect is to set the fluent defense_time (to 1
in our example).

act([A],throw(D,F)) causes hasball(A) eq 0 :- action(act([A],throw(D,F))).
act([A],throw(D,F)) causes defense_time eq 1 :- action(act([A],throw(D,F))).
act([A],throw(D,F)) causes pair(x(ball),y(ball)) eq pair(x(A)^(-1)+F*DX, y(A)^(-1)+F*DY) :-
    action(act([A],throw(D,F))), delta(D,DX,DY).
act([A],throw(D,F)) causes hasball(B) eq 1
    if [pair(x(B),y(B)) eq pair(x(A)+F*DX, y(A)+F*DY)] :-
    action(act([A],throw(D,F))), player(B), neq(A,B), delta(D,DX,DY).
act([A],throw(D,F)) causes point(black) eq 1 if [x(A)+F*DX eq Net] :-
    action(act([A],throw(D,F))), delta(D,DX,_), net(Net).

The effects of the other two actions move and whistle can be stated by:

act([player(white,1)],whistle) causes point(white) eq 1 if [x(ball) lt NET] :-
    net(NET).
act([player(white,1)],whistle) causes point(black) eq 1 if [NET lt x(ball)] :-
    net(NET).
act([A],move(D)) causes pair(x(A),y(A)) eq pair(x(A)^(-1)+DX, y(A)^(-1)+DY) :-
    action(act([A],move(D))), delta(D,DX,DY).
act([A],move(D)) causes defense_time eq defense_time^(-1) - 1 :- action(act([A],move(D))).
act([A],move(D)) causes hasball(A) eq 1
    if [pair(x(ball),y(ball)) eq pair(x(A)+DX, y(A)+DY)] :-
    action(act([A],move(D))), delta(D,DX,DY). □
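As a worked illustration of the throw axioms, the following Python sketch computes the destination cell of a throw and checks the executability and scoring conditions stated in Examples 3 and 4. It is only a hedged paraphrase of the axioms: the function names are hypothetical, and the constants are the field parameters of Example 1 (linex = 11, liney = 5, net = 6).

```python
# Direction offsets, copied from the delta/3 facts of Example 2.
DELTA = {"nw": (-1, 1), "n": (0, 1), "ne": (1, 1), "w": (-1, 0),
         "e": (1, 0), "sw": (-1, -1), "s": (0, -1), "se": (1, -1)}
LINEX, LINEY, NET = 11, 5, 6  # field sizes and net position of Example 1


def throw_destination(pos, direction, force):
    """Cell reached by the ball: (x + F*DX, y + F*DY)."""
    dx, dy = DELTA[direction]
    return pos[0] + force * dx, pos[1] + force * dy


def throw_executable(pos, direction, force, hasball, defense_time):
    """Executability condition of throw: the thrower holds the ball,
    defense_time is 0, and the destination lies inside the field."""
    x, y = throw_destination(pos, direction, force)
    return (hasball > 0 and defense_time == 0
            and 1 <= x <= LINEX and 1 <= y <= LINEY)


def gives_point_to_black(pos, direction, force):
    """A throw by a white player that lands on the net column scores for black."""
    return throw_destination(pos, direction, force)[0] == NET
```

For instance, a white player at (8, 3) throwing north-east with force 2 sends the ball to (10, 5), which is inside the field; throwing east with force 4 would leave the field, so that throw is not executable.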

In presence of a conflict (i.e., two agents executing actions that assign a distinct 
value to the same fluent), at least two perspectives can be followed, by assigning 
either a passive or an active role to the conflicting agents. In the first case, a 
supervising entity is in charge of resolving the conflict, and all the agents will 
comply with the supervisor's decisions. Alternatively, the agents themselves are 
in charge of reaching an agreement, possibly through negotiation. In the latter 
case, the following declarations allow one to specify in the action theories some 
basic reaction policies the agents might apply: 



action x [OPT] (7)

with OPT defined as:

OPT ::= on_conflict OC [OPT]
      | on_failure OF [OPT]
OC  ::= retry_after T [provided C]
      | forego [provided C]
OF  ::= retry_after T [if C]
      | replan [if C] [add_goal C]
      | fail [if C]

where T is a number of steps and C is a constraint. Notice that one can also
specify policies to be adopted whenever a failure occurs in executing an action. 

We remark here the difference between conflict and failure. A conflict occurs 
whenever concurrent actions performed by different agents try to make incon- 
sistent modifications to the state of the world. A failure occurs whenever an 
action x cannot be executed as planned by an agent a. This might happen, for 
instance, because after the detection of a conflict involving x, the outcome of 
the conflict resolution phase requires x to be inhibited. In this case the agent a 
might have to reconsider its plan. Hence, reacting to a failure is a "local" activity
the agent might perform after the state transition has been completed. In
axioms of the form (7), one can specify different reactions to a conflict (resp. a
failure) of the same action. Alternatives will be considered in their order of
appearance.

Example 5 Let us assume that the agents a and b have priority 0, while agent 
c has lower priority 2. Let us also assume that the current state is such that 
actions act_a, act_b, and act_c are all executable (respectively, by agents a, b, 
and c), where their effects on fluent f are of setting it to 1, 2, and 3, respectively. 
This indicates a situation of conflict, since the effects of the concurrent execution 
of the three actions are inconsistent. Assume that the following options have 
been defined: 

action act_a on_conflict retry_after 2 

action act_b on_conflict forego 

action act_c on_failure retry_after 3 
and that the plan of agent a (resp., b, c) requires the execution of action act_a 
(resp., act_b, act_c) in the current state. One possible conflict resolution is
to consider the priority of the agents. This causes act_c to be removed from the
execution list. Thus, agent c fails in executing act_c and will retry the same 
action after 3 steps. 

Some policy must now be chosen to resolve the conflict between a and b.
The first possibility is that agents have passive roles in conflict resolution, and
a supervisor selects, according to some criteria, a consistent subset of the
actions/agents. For example, if a is selected (e.g., by lexicographic order), then
the state will be modified by setting f = 1, declaring act_a successful, while
agent b will fail.

An alternative is to allow the agents a and b to directly resolve the conflict,
using their on_conflict options. This causes a to retry the execution of act_a
after 2 time steps and b to forego the execution of act_b. Both of them will get
a failure message, because neither act_a nor act_b is executed. □

Apart from the possible communications occurring among agents during the 
conflict resolution phase, other forms of "planned" communication can be mod- 
eled in an action theory. An axiom of this form 

request $C_1$ if $C_2$ (8)

describes a special static causal law that allows an agent to broadcast a request,
whenever a certain condition ($C_2$) is encountered. By executing this action, an
agent asks if there is another agent that can make the constraint $C_1$ true. Only
an agent knowing all of the fluents occurring in $C_1$ is allowed to answer such a
request.

Instead of broadcasting a help request, an agent $a$ can send such a message
directly to another agent by providing its name (any request sent to a
nonexistent agent will never receive an answer):

request $C_1$ to_agent $a'$ if $C_2$ (9)
The following communication primitive subsumes the previous ones:

request $C_1$ [ to_agent $a'$ ] if $C_2$ [ offering $C_3$ ] (10)


If the last option is used, the requesting agent also provides a "reward" by 
promising to ensure C3 in case of acceptance of the proposal. Axioms of this type 
allow us to model negotiations and other forms of bargaining and transactions. 

In turn, agents may declare their willingness to accept requests and serve
other agents using statements of the form

help Agent_List [ if C ] (11)

where Agent_List is either a list of agent names $a_1, \dots, a_k$, denoting that the
agent in question can serve requests coming from the agents $a_1, \dots, a_k$, or the
keyword all, denoting the fact that the agent can accept requests coming from
any source. The optional condition allows the agent to select which requests to
consider depending on properties of the current state of the world.

Example 6 Let us consider a domain with three agents: a guitar maker, a
joiner that provides wooden parts of guitars (bodies and necks), and a seller that
sells strings and pickups. We assume that the maker has plenty of money (so
we do not take into account what it spends), that the seller wants to be paid
for its materials, and that necks and bodies can be obtained for free (e.g., the
joiner has a fixed salary paid by the maker). The income of the seller is modeled
by changes to the value of the fluent seller_account. In Figure 1 we report
an action description that models the agent guitar_maker; analogous theories
can be formulated for the other two agents. Observe that two point-to-point
interactions are modeled, namely, the one between the guitar_maker and the
joiner, to obtain necks and bodies, and the one between the guitar_maker and
the seller, to buy strings ($8) and pickups ($60). Two kinds of guitars can be
made, differing in the number of pickups. □

Various forms of global constraint can be exploited to impose control knowledge
and maintenance goals. These constraints represent properties that must
always persist in the world where the agents act. Some examples:

• FC holds_at n. This constraint is satisfied if the fluent constraint FC
holds at the n-th time step.

• always FC. This constraint imposes the condition that the fluent constraint
FC holds in all the states of the evolution of the world.

The semantics of these constraints is reported in Section 3.1.

An action domain description consists of a collection $\mathcal{D}_a$ of axioms of the
forms described so far, for each agent $a \in \mathcal{G}$. Moreover, it includes, for each agent
$a$, a collection $\mathcal{O}_a$ of goal axioms (objectives), of the form goal C, where C is a
constraint, and a collection $\mathcal{I}_a$ of initial state axioms of the form initially C,
where C is a constraint involving only timeless expressions. For the sake of
simplicity, we assume that all the sets $\mathcal{I}_a$ are drawn from a consistent global
initial state description $\mathcal{I}$, i.e., $\mathcal{I}_a \subseteq \mathcal{I}$. A specific instance of a planning problem
is a triple $\langle (\mathcal{D}_a)_{a \in \mathcal{G}}, (\mathcal{I}_a)_{a \in \mathcal{G}}, (\mathcal{O}_a)_{a \in \mathcal{G}} \rangle$.

agent guitar_maker.
action make_guitar.
executable make_guitar if neck > 0 and strings >= 6 and body > 0 and pickup > 0.
% actions for making two different kinds of guitars:
make_guitar causes guitars = guitars^(-1) + 1 and neck = neck^(-1) - 1 and body = body^(-1) - 1
    and strings = strings^(-1) - 6 and pickup = pickup^(-1) - 2
    if pickup >= 2.
make_guitar causes guitars = guitars^(-1) + 1 and neck = neck^(-1) - 1 and strings = strings^(-1) - 6
    and body = body^(-1) - 1 and pickup = pickup^(-1) - 1
    if pickup < 2.
% interaction with joiner:
request neck > 0 to_agent joiner if neck = 0.
request body > 0 to_agent joiner if body = 0.
% interaction with seller:
request strings > 5 to_agent seller if strings < 6
    offering seller_account = seller_account^(-1) + 8.
request pickup > 0 to_agent seller if pickup = 0
    offering seller_account = seller_account^(-1) + 60.
% the goal is to make 10 guitars:
goal guitars = 10.
% initially the maker owns some material:
initially guitars = 2 and body = 3 and neck = 5 and pickup = 6 and strings = 24.

Figure 1: An action description in $\mathcal{B}^{AAC}$ for a guitar maker agent




3 System behavior 

The behavior of $\mathcal{B}^{AAC}$ can be split into two parts: the semantics of the action
description language, parametric on the supervisor selection strategy, and the
strategies themselves, which can be programmed. We present the former in Section 3.1 and the
latter in Sections 3.2-3.5. Finally, some implementation notes are reported in
a later section.



3.1 Semantics of $\mathcal{B}^{AAC}$

The semantics of the action language is described by a transition function that
operates on states. A state $s$ is identified by a total function $v : \mathcal{F} \to \mathcal{V}$. We
assume a given horizon $N$, within which the planning activities of all agents
have to be completed.

Let $\vec{v} = \langle v_0, \dots, v_i \rangle$ be a state sequence, with $0 \leq i \leq N$. Given $j \in
\{0, \dots, i\}$, and a fluent expression $\varphi$, we define the concept of value of $\varphi$ in $\vec{v}$ at
time $j$, denoted by $\vec{v}(j, \varphi)$, as follows:

$\vec{v}(j, x) = x$ if $x \in \mathcal{V}$
$\vec{v}(j, f^t) = v_{j+t}(f)$ if $f \in \mathcal{F}$ and $0 \leq j + t$
$\vec{v}(j, f^t) = v_0(f)$ if $f \in \mathcal{F}$ and $j + t < 0$
$\vec{v}(j, \mathrm{abs}(\varphi)) = |\vec{v}(j, \varphi)|$
$\vec{v}(j, \varphi_1 \oplus \varphi_2) = \vec{v}(j, \varphi_1) \oplus \vec{v}(j, \varphi_2)$
$\vec{v}(j, \mathrm{rei}(C)) = 1$ if $\vec{v} \models_j C$
$\vec{v}(j, \mathrm{rei}(C)) = 0$ if $\vec{v} \not\models_j C$

where $\oplus \in \{+, -, *, /, \mathrm{mod}\}$. The last two cases specify the semantics of reification,
which relies on the notion of satisfaction, which in turn is defined by
structural induction on constraints, as follows. Given a primitive constraint
$\varphi_1 \; op \; \varphi_2$ and a state sequence $\vec{v}$, the notion of satisfaction at time $j$ is defined
as: $\vec{v} \models_j \varphi_1 \; op \; \varphi_2$ iff $\vec{v}(j, \varphi_1) \; op \; \vec{v}(j, \varphi_2)$. The notion $\models_j$ is generalized to the
case of propositional combinations of fluent constraints in the usual manner.
For the case of pair, we have that $\vec{v} \models_j \mathrm{pair}(E_1, E_3) = \mathrm{pair}(E_2, E_4)$ if and
only if $\vec{v} \models_j E_1 = E_2 \wedge E_3 = E_4$.
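The evaluation rules above can be read as a small recursive interpreter. The Python sketch below is one possible rendering, not the prototype's CLP(FD) encoding: the tuple representation of expressions, the function names `value` and `holds`, and the use of Python's floor division and modulo for `/` and `mod` are all assumptions made for illustration.

```python
import operator

# Hypothetical encoding of fluent expressions and constraints:
#   5                        integer constant
#   ("fl", "x", -1)          annotated fluent x^{-1}
#   ("op", "+", e1, e2)      binary arithmetic
#   ("abs", e)               absolute value
#   ("rei", c)               reified constraint
#   ("cmp", "eq", e1, e2)    primitive constraint
#   ("and", c1, c2), ("or", c1, c2), ("not", c)   propositional combinations
ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul,
         "/": operator.floordiv, "mod": operator.mod}  # rounding is assumed
CMP = {"eq": operator.eq, "neq": operator.ne, "lt": operator.lt,
       "leq": operator.le, "gt": operator.gt}


def value(states, j, expr):
    """Value of expr at time j in the state sequence (a list of dicts)."""
    if isinstance(expr, int):
        return expr
    tag = expr[0]
    if tag == "fl":
        _, name, t = expr
        # If j + t < 0 the value is read from the initial state v_0.
        return states[max(j + t, 0)][name]
    if tag == "op":
        _, sym, e1, e2 = expr
        return ARITH[sym](value(states, j, e1), value(states, j, e2))
    if tag == "abs":
        return abs(value(states, j, expr[1]))
    if tag == "rei":
        return 1 if holds(states, j, expr[1]) else 0
    raise ValueError(f"unknown expression: {expr!r}")


def holds(states, j, c):
    """Satisfaction of constraint c at time j."""
    tag = c[0]
    if tag == "cmp":
        _, op, e1, e2 = c
        return CMP[op](value(states, j, e1), value(states, j, e2))
    if tag == "and":
        return holds(states, j, c[1]) and holds(states, j, c[2])
    if tag == "or":
        return holds(states, j, c[1]) or holds(states, j, c[2])
    if tag == "not":
        return not holds(states, j, c[1])
    raise ValueError(f"unknown constraint: {c!r}")
```

With the two-state sequence `[{"x": 3}, {"x": 4}]`, the expression `("op", "+", ("fl", "x", -1), 1)` evaluated at time 1 reads the past value of x (3) and adds 1.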

We recall that a timeless fluent is a fluent expression of the form $f^0$ (and
$f$).

Given a constraint $C$ and a state sequence $\vec{v} = \langle v_0, \dots, v_i \rangle$, let fluents($C$) be
the set of timeless fluents occurring in $C$. A function $\sigma : \mathrm{fluents}(C) \to \mathcal{V}$ is
a $\vec{v}$-solution of $C$ if $\langle v_0, \dots, v_i, \sigma \rangle \models_{i+1} C$. Let us observe that this definition
makes use of a slight abuse of notation, since $\sigma$ is potentially not a complete
state (some fluents may not have been assigned a value by $\sigma$). Nevertheless, the
choice of fluents in fluents($C$) guarantees the possibility of correctly evaluating
$C$. In other words, $\sigma$ can be seen as a partial state contributing (with $\vec{v}$) to
the satisfaction of $C$ at time $i + 1$. Let us see how to complete this state using
inertia: if $\sigma$ is a $\vec{v}$-solution of a constraint $C$, $\mathrm{ine}(\sigma, \vec{v})$ is defined as follows:

$\mathrm{ine}(\sigma, \vec{v})(f) = \sigma(f)$ if $f \in \mathrm{fluents}(C)$
$\mathrm{ine}(\sigma, \vec{v})(f) = v_i(f)$ otherwise

Fluents not appearing in $C$ are considered inertial (namely they maintain their
previous values) and therefore the state is completed using the function ine.
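The inertia completion amounts to overriding the last state with the partial solution. A minimal Python sketch follows, assuming states are dictionaries from fluent names to integers (a representation chosen here for illustration only):

```python
def ine(sigma, states):
    """Complete the partial solution sigma into a full state.

    Fluents assigned by sigma take the new value; every other fluent
    keeps the value it has in the last state of the sequence.
    """
    completed = dict(states[-1])  # inertial values v_i(f)
    completed.update(sigma)       # sigma(f) for f in fluents(C)
    return completed
```

For example, if the last state is `{"x(ball)": 2, "y(ball)": 3}` and the solution only assigns `x(ball) = 7`, the completed state keeps `y(ball) = 3`.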

An action $x$ is executable by agent $a$ in a state sequence $\vec{v} = \langle v_0, \dots, v_i \rangle$ if
there is at least one axiom executable x if C in $\mathcal{D}_a$ and it holds that $\vec{v} \models_i C$.
If there is more than one executability condition, it is sufficient for one of them
to apply.

Let us denote with Dyn($x$) the set of dynamic causal law axioms for action
$x$. The desired effect of executing $x$ in state sequence $\vec{v} = \langle v_0, \dots, v_i \rangle$, denoted
by $\mathrm{DEff}(x, \vec{v})$, is a constraint defined as follows:

$\mathrm{DEff}(x, \vec{v}) = \bigwedge \{ \mathit{Eff} \mid x \text{ causes } \mathit{Eff} \text{ if } \mathit{Prec} \in \mathrm{Dyn}(x),\ \vec{v} \models_i \mathit{Prec} \}$.

Request accomplishment actions can be used in the construction of this set.
Given a state sequence $\vec{v} = \langle v_0, \dots, v_i \rangle$, a state $v_{i+1}$, and a set of actions $X$,
a triple $\langle \vec{v}, X, v_{i+1} \rangle$ is a valid state transition if:

• for all $x \in X$, the action $x$ is executable in $\vec{v}$ by some agent $a$, and

• $v_{i+1} = \mathrm{ine}(\sigma, \vec{v})$, where $\sigma$ is a $\vec{v}$-solution of the constraint $\bigwedge_{x \in X} \mathrm{DEff}(x, \vec{v})$.

Observe that if $X = \emptyset$, then $\langle \vec{v}, \emptyset, v_i \rangle$ will be a valid state transition.

Let $\vec{v} = \langle v_0, \dots, v_N \rangle$ be a sequence of states, $\langle (\mathcal{D}_a)_{a \in \mathcal{G}}, (\mathcal{I}_a)_{a \in \mathcal{G}}, (\mathcal{O}_a)_{a \in \mathcal{G}} \rangle$
an instance of a planning problem, and $X_1, \dots, X_N$ be sets of actions. We say
that $\langle v_0, X_1, v_1, \dots, X_N, v_N \rangle$ is a valid trajectory if:

• for each agent $a$ and for each axiom of the form initially C in $\mathcal{I}_a$, we
have that $\vec{v} \models_0 C$;

• for all $i \in \{0, \dots, N - 1\}$, $\langle \langle v_0, \dots, v_i \rangle, X_{i+1}, v_{i+1} \rangle$ is a valid state transition.

A valid trajectory is successful for an agent $a$ if, for each axiom of the form
goal C in $\mathcal{O}_a$, it holds that $\vec{v} \models_N C$.

At each time step $i$, each agent might propose a set of actions for execution;
we assume that all the proposed actions are executable in the state sequence
$\vec{v}_i = \langle v_0, \dots, v_i \rangle$. Let $Y_{i+1}$ be this set of actions. The supervisor selects a subset
$X_{i+1} \subseteq Y_{i+1}$ such that the constraint $\mathrm{Eff}(X_{i+1}, \vec{v}_i)$, defined as:

$\mathrm{Eff}(X_{i+1}, \vec{v}_i) = \bigwedge_{x \in X_{i+1}} \mathrm{DEff}(x, \vec{v}_i)$

is satisfiable w.r.t. $\vec{v}_i$, i.e., there exists a complete state $v_{i+1}$ such that $\langle \vec{v}_i, X_{i+1}, v_{i+1} \rangle$
is a valid state transition. It is the job of the supervisor to determine the subset
$X_{i+1}$ given $Y_{i+1}$ and $\vec{v}_i$, as a maximal consistent subset, using agent priorities
or other approaches, as discussed in Section 3.3. If an agent cannot find a plan
at the time step $i$ it will ask for a nop and try again at the next step.
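One simple way to obtain an inclusion-maximal consistent subset is a greedy pass over the proposals in priority order. The Python sketch below illustrates this under a strong simplifying assumption: each action's desired effect is a set of assignments of constants to fluents, so that two actions conflict exactly when they assign distinct values to the same fluent. The function name and the data layout are hypothetical, and ties are broken lexicographically by agent name, as in the passive-role case of Example 5; the actual supervisor may use other criteria.

```python
def select_actions(proposals, priority):
    """Greedy choice of a consistent subset of the proposed actions.

    proposals: dict mapping an action name to (agent, effects), where
               effects maps fluent names to the values the action sets.
    priority:  dict mapping an agent to its priority (0 is the highest;
               agents without a declaration default to 0).
    Returns (selected, rejected, update), where update collects the
    assignments of the selected actions.
    """
    def rank(action):
        agent = proposals[action][0]
        return (priority.get(agent, 0), agent, action)

    update, selected, rejected = {}, [], []
    for action in sorted(proposals, key=rank):
        effects = proposals[action][1]
        # Accept the action only if it agrees with everything chosen so far.
        if all(update.get(f, v) == v for f, v in effects.items()):
            update.update(effects)
            selected.append(action)
        else:
            rejected.append(action)
    return selected, rejected, update
```

Applied to the situation of Example 5 (a and b with priority 0, c with priority 2, all three actions setting f to different values), the sketch selects act_a and rejects act_b and act_c, so agents b and c receive a failure.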

Let us complete the semantics of the language by dealing with request and
help laws. A request of the agent $a$

request $C_1$ to_agent $a'$ if $C_2$

is executable in a state sequence $\vec{v} = \langle v_0, \dots, v_i \rangle$ if it holds that $\vec{v} \models_i C_2$. If the
request above is executable, it can be accomplished in the successive state $v_{i+1}$
if there is an axiom

help ... a ... if $C_3$

in $\mathcal{D}_{a'}$ and $\langle v_0, \dots, v_i, v_{i+1} \rangle \models_{i+1} C_3$. The semantics of the help law is that
of enabling a request accomplishment (after a request demand) and it can be
viewed as the execution of an ordinary action by agent $a'$ (we hypothetically
assume that $a'$ has access to all fluents of $a$). We can view this
as if $a'$ had an additional action $y$ defined in $\mathcal{D}_{a'}$ as:

executable y if $C_3 \wedge (C_2)^{-1}$
y causes $C_1$ if true
Observe that, as happens for executability laws, multiple help preconditions are
considered disjunctively. If the request also includes the option offering $C_4$,
then the action $y$ will cause $C_1 \wedge C_4$ as effect.

Let us add some comments on agents' requests for action execution. Each
agent wishes to execute some actions and to issue some requests. After the
supervisor has decided which actions will be executed, each agent retrieves the
relevant requests and analyzes them in order to possibly fulfill them in the next
time step (see below for further details). These requests behave like an action
$y$, as stated above.

Two global constraints are allowed by the language $\mathcal{B}^{AAC}$. Their effect is to
filter out sequences of states that do not fulfill those constraints:

• C holds_at i imposes that any valid trajectory $\langle v_0, X_1, v_1, \dots, X_N, v_N \rangle$
must satisfy $\langle v_0, v_1, \dots, v_N \rangle \models_i C$;

• always C imposes that any valid trajectory $\langle v_0, X_1, v_1, \dots, X_N, v_N \rangle$ must
satisfy $\langle v_0, v_1, \dots, v_N \rangle \models_i C$ for all $i \in \{0, \dots, N\}$.

The supervisor is in charge of checking if these constraints can be satisfied while
selecting $X_i$ as mentioned before. If the fluents involved in the constraints
are all known to an agent $a$, the set of actions proposed by $a$ is such that
it will guarantee the property if all of them (and only them) are selected for
application.

Each agent $a$, at each time step $i$, selects a set of actions $Y^a_{i+1}$ it wishes to
execute. To do that, $a$ looks for a sequence of (sets of) actions to achieve
its local goal, given the current state sequence $\langle v_0, \dots, v_i \rangle$. The set of actions
$Y^a_{i+1}$ are those to be executed at the current time step. If the new state $v_{i+1}$
communicated by the supervisor is different from the state it expected after the
application of all the actions in the set $Y^a_{i+1}$ (due either to the fact that some of
these actions are not selected, or that other agents have executed actions that
have unexpectedly changed some values), it will need to replan. Let us observe
that, although globally the supervisor views a valid trajectory, locally this is not
true (some state transitions are not justified by the actions of agent $a$ alone).
However, in looking for a plan (and in replanning), the agent reasons on an "internal"
valid trajectory from the current time to the future.

Let us focus on the problem of reacting to requests. Suppose that an agent
$a'$, at time $i$ in a state sequence $\langle v_0, v_1, \dots, v_i \rangle$, receives the requests $r_1, \dots, r_h$,
where $r_j$ is of the form

request $C_1^j$ to_agent $a'$ if $C_2^j$

and, moreover, assume that these requests are ordered (e.g., by the priorities of
the requesting agent $a_j$). For $j = 1, \dots, h$, if $\mathcal{D}_{a'}$ contains an axiom

help ... $a_j$ ... if $C_3$

such that $\langle v_0, v_1, \dots, v_i \rangle \models_i C_3$, the agent $a'$ temporarily adds to its theory the
constraint

$C_1^j$ holds_at $i + 1$ (12)

and looks for a plan in the enlarged theory. If such a plan exists, the constraint
(12) is permanently stored in $\mathcal{D}_{a'}$, otherwise the request is ignored. In both cases,
$a'$ proceeds with the next request ($j \leftarrow j + 1$). At the end, some (possibly none)
of the $h$ constraints $C_1^1, \dots, C_1^h$ will be fulfilled by a plan and the set of actions
$Y^{a'}_{i+1}$ of the next step of this plan is passed to the supervisor.
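The request-handling loop just described is a greedy accumulation of holds_at constraints. The following Python sketch shows its control flow only; the planner is abstracted as a callback `has_plan`, the truth of help conditions as a callback `holds`, and the dictionary layout of requests and help axioms is a hypothetical encoding introduced here.

```python
def accept_requests(requests, help_axioms, holds, has_plan, i):
    """Return the holds_at constraints the serving agent commits to.

    requests:    list of dicts {"sender": name, "ask": constraint},
                 assumed to be already ordered (e.g., by sender priority).
    help_axioms: list of dicts {"from": "all" or list of names,
                 "if": optional condition}.
    holds:       callback deciding whether a help condition holds now.
    has_plan:    callback deciding whether the agent still has a plan
                 once the given (constraint, time) pairs are imposed.
    """
    accepted = []
    for req in requests:
        willing = any(
            (ax["from"] == "all" or req["sender"] in ax["from"])
            and (ax.get("if") is None or holds(ax["if"]))
            for ax in help_axioms
        )
        if not willing:
            continue
        # Temporarily add "ask holds_at i+1" and try to plan with it.
        candidate = accepted + [(req["ask"], i + 1)]
        if has_plan(candidate):
            accepted = candidate  # keep the constraint permanently
        # otherwise the request is ignored and the next one is examined
    return accepted
```

A request is thus kept only if it is compatible with the constraints accepted before it, which is why the order in which requests are examined matters.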

Let us focus now on how the agent $a$ deals with the options related to a failure
(this is also developed in Section 3.4). Let us assume an action $x$ submitted
for execution at time $i$ has not been selected by the supervisor, and, therefore,
a failure signal is returned to the agent $a$. The current sequence of states is
$\vec{v} = \langle v_0, v_1, \dots, v_{i+1} \rangle$.

Let us analyze what happens in the three options:

• fail if $C_1$: if $\vec{v} \models_{i+1} C_1$ then agent $a$ declares its failure. From this
point onwards, the agent will not generate any actions, nor interact with
other agents.

• replan if $C_1$ add_goal $C_2$: if $\vec{v} \models_{i+1} C_1$ then goal $C_2$ is added in $\mathcal{D}_a$
(and then the agent $a$ starts replanning).

• retry_after T if $C_1$: if $\vec{v} \models_{i+1} C_1$ then for $T - 1$ time steps the agent
$a$ requests only nop from the supervisor; at time step $T + i$ the action $x$ is
requested again.

If the if option is missing, the condition will be assumed to be satisfied. If
the add_goal option is missing, no new goal will be added.



3.2 Concurrent plan execution 

The agents are autonomous and develop their activities independently, except
for the execution of the actions/plans. In executing their plans, the agents must
take into account the effects of concurrent actions.

We developed the basic communication mechanism among agents by exploiting
a tuple space, whose access and manipulation follow the blackboard
principles introduced in the Linda model [5]. Linda is a popular model for
coordination and communication among processes; it offers coordination via
a shared memory, commonly referred to as a blackboard or tuple space. All
information is stored in the blackboard in the form of tuples; the shared
blackboard provides atomic access and associative memory behavior (in retrieving
and removing tuples). The SICStus Prolog implementation of Linda allows
the definition of a server process, in charge of managing the blackboard, and
client processes, which can add tuples (using the out operation), read tuples
(using the rd operation) and remove tuples (using the in operation).
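
To make the three operations concrete, the following is a toy, in-process blackboard written in Python. It is only a sketch of the Linda idea under simplifying assumptions (a single process, threads as clients, `None` used as a wildcard in patterns); it is not the SICStus Linda library, and the class and method names are our own. Since `in` is a reserved word in Python, the removal operation is called `in_`.

```python
import threading
from typing import List, Optional, Tuple

class TupleSpace:
    """Toy blackboard offering Linda-style out / rd / in operations."""

    def __init__(self) -> None:
        self._tuples: List[Tuple] = []
        self._cond = threading.Condition()

    def out(self, tup: Tuple) -> None:
        """Add a tuple and wake up any client waiting for a match."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    @staticmethod
    def _matches(pattern: Tuple, tup: Tuple) -> bool:
        # None in a pattern plays the role of an unbound variable.
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def _find(self, pattern: Tuple) -> Optional[Tuple]:
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def rd(self, pattern: Tuple, timeout: Optional[float] = None) -> Optional[Tuple]:
        """Blocking read: return a matching tuple without removing it."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            return self._find(pattern) if found else None

    def in_(self, pattern: Tuple, timeout: Optional[float] = None) -> Optional[Tuple]:
        """Blocking removal: return a matching tuple and delete it."""
        with self._cond:
            found = self._cond.wait_for(
                lambda: self._find(pattern) is not None, timeout
            )
            if not found:
                return None
            tup = self._find(pattern)
            self._tuples.remove(tup)
            return tup
```

Holding the condition's lock for the whole of each operation is what gives the atomic access mentioned above; matching by pattern gives the associative behavior.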

Most of the interactions among concurrent agents, especially those interac- 
tions aimed at resolving conflicts, are managed by a specific process, the super- 
visor, that also provides a global time to all agents, enabling them to execute 
their actions synchronously. The supervisor process stores the initial state and 
the changes caused by the successful executions of actions. It synchronizes the
execution of actions, and controls the coordination and the arbitration in case of
conflicts. It also sends a success or a failure signal to each agent at each action
execution attempt, together with the list of changes to its local state.

Let us describe how the execution of concurrent plans proceeds. As mentioned,
each action description includes a set of constraints describing a portion
of the initial state.

1. At the beginning, the supervisor acquires the specification $I = \bigcup_{a} I_a$
of the initial state, where $a$ ranges over the agents.

2. At each time step the supervisor starts a new state transition: 

• Each agent sends to the supervisor a request to perform an action (i.e.,
the next action of its locally computed plan) by specifying its effects
on the (local) state.

• The supervisor collects all these requests and starts an analysis, aimed 
at determining the subsets of actions/agents that conflict (if any). A 
conflict occurs whenever agents require incompatible assignments of 
values to the same fluents. The transition takes place once all conflicts 
have been resolved and a subset of compatible actions has been iden- 
tified by means of some policy (see below). These actions are enabled 
while the remaining ones are inhibited. 

• All the enabled actions are executed, producing changes to the global 
state. 

• These changes are then sent back to all agents, to achieve the corre- 
sponding updates of each agent's local state. All agents are also notified 
about the outcome of the procedure. In particular, those agents whose 
actions have been inhibited receive a failure message. 

3. The computation stops when the time N is reached. 
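
The state transition of step 2 can be sketched as follows. This Python fragment is an illustrative assumption, not the supervisor module of the prototype: effects are modeled as plain fluent/value assignments, and conflicts are resolved by agent priority only, which is just one of the policies discussed in the next section.

```python
from typing import Dict, List, Tuple

Effects = Dict[str, int]  # fluent name -> value requested for the next state

def conflicts(e1: Effects, e2: Effects) -> bool:
    """Two requests conflict if they assign different values to a fluent."""
    return any(f in e2 and e2[f] != v for f, v in e1.items())

def transition(
    state: Dict[str, int],
    requests: Dict[str, Effects],
    priority: Dict[str, int],
) -> Tuple[Dict[str, int], Dict[str, str], Effects]:
    """One supervisor step: enable a compatible subset, apply it, report.

    requests maps each agent to the effects of the action it proposes;
    agents with a higher priority value are served first.
    """
    enabled: List[str] = []
    for agent in sorted(requests, key=lambda a: -priority[a]):
        if all(not conflicts(requests[agent], requests[o]) for o in enabled):
            enabled.append(agent)
    changes: Effects = {}
    for agent in enabled:
        changes.update(requests[agent])
    new_state = {**state, **changes}
    outcome = {a: ("success" if a in enabled else "failure") for a in requests}
    return new_state, outcome, changes
```

The returned `changes` correspond to the updates sent back to all agents, and `outcome` to the success or failure message each agent receives.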

Observe that, after each step of the local plan execution, each agent needs to 
check if the reached state still supports its successive planned actions. If not, 
the agent has to reason locally and revise its plan, i.e., initiate a replanning 
phase. This is due to the fact that the reached state might be different from 
the expected one. This may occur in two cases: 

1. The proposed action was inhibited, so the agent actually executed a nop; 
this case occurs when the agent receives a failure message from the super- 
visor. 

2. The interaction was successful, i.e., the planned action was executed, but 
the effects of the actions performed by other agents affected fluents in its 
local state, preventing the successful continuation of the remaining part 
of the local plan. For instance, the agent a may have assumed that the 
fluent g maintained its value by inertia, but another agent, say b, changed 
such value. There is no direct conflict between the actions of a and b, but 
agent a has to verify that the rest of its plan is still applicable (e.g., the 
next action in a's plan may have lost its executability condition). 
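
The check performed by an agent after each step can be illustrated with a small simulation. The sketch below is in Python and rests on assumptions of ours: an action is reduced to a pair of dictionaries (executability conditions and effects), and fluents not mentioned in the effects keep their value by inertia.

```python
from typing import Dict, List, Tuple

Action = Tuple[Dict[str, int], Dict[str, int]]  # (preconditions, effects)

def plan_still_valid(
    state: Dict[str, int], plan: List[Action], goal: Dict[str, int]
) -> bool:
    """Simulate the remaining plan from the state reported by the supervisor.

    Returns False as soon as an executability condition is violated, or
    if the goal does not hold at the end; in both cases the agent would
    have to replan.
    """
    current = dict(state)
    for preconditions, effects in plan:
        if any(current.get(f) != v for f, v in preconditions.items()):
            return False
        current.update(effects)  # untouched fluents persist by inertia
    return all(current.get(f) == v for f, v in goal.items())
```

For instance, a plan whose next action requires `g = 1` stops being valid if another agent has set `g` to 0, even though no direct conflict was reported.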

3.3 Conflict resolution



A conflict resolution procedure is invoked by the supervisor whenever it deter- 
mines the presence of a set of incompatible actions. Different policies can be 
adopted in this phase and different roles can be played by the supervisor. 

First of all, the supervisor exploits the priorities of the agents to attempt a 
resolution of the conflict, by inhibiting the actions issued by low priority agents. 
If this does not suffice, further options are applied. We describe here some of the
simplest viable options, which we have already implemented in our prototype.
The architecture of the system is modular (see Sect. 3.7), and can be easily
extended to include more complex policies and protocols.

The two approaches we have implemented so far differ in assigning the active
role in resolving the conflict either (a) to the supervisor or (b) to the conflicting
agents.

In the first case, the supervisor has an active role — it acts as a referee and 
decides, without any further interaction with the agents, which actions have to 
be inhibited. In the current prototype, the arbitration strategy is limited to: 

• A random selection of a single action to be executed; or 

• The computation of a maximal set of compatible actions to be executed.
This computation is done by solving a CSP, which is dynamically generated
using a CLP(FD) encoding.

Note that, in this strategy, the on_conflict policies assigned to actions by
axioms ([t]) are ignored. This "centralized" approach is relatively simple; it also
has strong potential for facilitating the creation of optimal plans. On the other
hand, the adoption of a centralized approach to conflict resolution might become
a bottleneck in the system, since all conflicting agents must wait for the supervisor's
decisions.
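
As an illustration of the second arbitration strategy, the sketch below computes a largest set of mutually compatible actions. Note the substitution: the prototype generates and solves a CLP(FD) constraint problem, whereas this Python fragment uses plain exhaustive search over subsets, which is adequate only for a handful of agents. Effects are again assumed to be fluent/value assignments.

```python
from itertools import combinations
from typing import Dict, List

Effects = Dict[str, int]

def compatible(actions: List[Effects]) -> bool:
    """True if no two actions assign different values to the same fluent."""
    merged: Effects = {}
    for effects in actions:
        for fluent, value in effects.items():
            if merged.setdefault(fluent, value) != value:
                return False
    return True

def maximal_compatible(requests: Dict[str, Effects]) -> List[str]:
    """Return a largest set of agents whose actions are jointly compatible."""
    agents = sorted(requests)
    for size in range(len(agents), 0, -1):  # try the largest subsets first
        for subset in combinations(agents, size):
            if compatible([requests[a] for a in subset]):
                return list(subset)
    return []
```

When several subsets of the same size are compatible, this sketch returns the first in lexicographic order of agent names; a real policy could break ties by priority instead.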

In the second case, the supervisor simply notifies the set of conflicting agents
about the inconsistency of their actions. The agents involved in the conflict
are completely in charge of resolving it by means of a negotiation phase. The
supervisor waits for a solution from the agents. In solving the conflict, each
agent a makes use of one of the on_conflict directives ([t]) specified for its
conflicting action x. The semantics of these directives are as follows (in all the
cases [provided C] is an optional qualifier; if it is omitted it is interpreted as
provided true):

• The option on_conflict forego provided C causes the agent a to "search"
among the other conflicting agents for someone, say b, that can guarantee
the condition C. In this case, b performs its action while the execution
of a's action fails, and a executes a nop in place of its action x. Different
strategies can be implemented in order to perform such a "search for
help". A simple one is the round-robin policy described below, but many
other alternatives are possible and should be considered in completing the
prototype.






• The option on_conflict retry_after T provided C will cause a to
execute nop during the following T time steps and then to try again
to execute its action (if the preconditions still hold).

• If there is no applicable option (e.g., no option is defined or none of the
agents accepts to guarantee C), the action is inhibited and its execution
fails.

The way in which agents negotiate and exploit the on_conflict options can
rely on several protocols, of different complexity. For instance, one possibility
might be to nominate a "leader" within each of the conflicting sets S of agents.
The leader is in charge of coordinating the agents in S to resolve the conflict
without interacting with the supervisor.

Another approach consists of leaving each agent in $S$ free to proceed and
to find an agreement by sending proposals to other agents (possibly by adopting
some order of execution, some priorities, etc.) and receiving their
proposals/answers. In the current prototype, we implemented a round-robin policy.
Let us assume that the state sequence already constructed is $\vec{v} = \langle v_0, \ldots, v_i \rangle$
and let us assume that the agents in the list $A = \langle a_1, \ldots, a_m \rangle$ aim at executing
the actions in the list $Y = \langle y_1, \ldots, y_m \rangle$, respectively. Furthermore, let us assume
that the execution of all actions in $Y$ will introduce a constraint that does not
have a $\vec{v}$-solution. There is a sorting of the agents, and they take turns in resolving
the conflict. Suppose that at a certain round of the procedure the agent $a_k$
is selected; it tries its next unexplored on_conflict OP provided $C$ option
for its action and checks if $\vec{v} \models_i C$.

• If $\vec{v} \models_i C$ then $a_k$ will apply the OP option, and $a_k$ and $y_k$ are removed
from $A$ and $Y$, respectively.

• Otherwise, the next agent is selected and the successive call to $a_k$ will
consider the next on_conflict option.

If there are no successive options for $a_k$, then $a_k$ and $y_k$ will be removed from $A$ and $Y$,
and a failure for $y_k$ will occur. After each step, if $Y$ has a $\vec{v}$-solution, then the
procedure will terminate and the actions in $Y$ will be executed. Observe that
this procedure always terminates with a solution to the conflict, since a finite
number of on_conflict options are defined for each action.
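
The round-robin procedure can be sketched in Python as follows. This is a simplified illustration under our own assumptions: effects are fluent/value assignments, "having a solution" is approximated by the absence of contradictory assignments, and the `provided` condition is evaluated by a caller-supplied function `holds` (in particular, the search for an agent guaranteeing the condition of a forego option is not modeled).

```python
from collections import deque
from typing import Callable, Dict, List, Tuple

Effects = Dict[str, int]

def compatible(effects_list: List[Effects]) -> bool:
    """True if no fluent is assigned two different values."""
    merged: Effects = {}
    return all(
        merged.setdefault(f, v) == v for eff in effects_list for f, v in eff.items()
    )

def round_robin(
    actions: Dict[str, Effects],
    options: Dict[str, List[Tuple[str, str]]],
    holds: Callable[[str], bool],
) -> Tuple[List[str], Dict[str, str]]:
    """Agents take turns trying their next unexplored on_conflict option.

    options maps an agent to its (OP, condition) pairs, in declaration
    order. Returns the agents whose actions are executed and, for each
    withdrawn agent, the option applied (or "failure").
    """
    remaining = dict(actions)
    pending = {a: deque(options.get(a, [])) for a in actions}
    withdrawn: Dict[str, str] = {}
    queue = deque(actions)
    while remaining and not compatible(list(remaining.values())):
        agent = queue.popleft()
        if not pending[agent]:            # no options left: the action fails
            del remaining[agent]
            withdrawn[agent] = "failure"
            continue
        op, condition = pending[agent].popleft()
        if holds(condition):              # apply OP and leave the conflict
            del remaining[agent]
            withdrawn[agent] = op
        else:                             # wait for the next turn
            queue.append(agent)
    return sorted(remaining), withdrawn
```

Each iteration either consumes one option or removes one agent, so the loop terminates, mirroring the termination argument given above.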

This is a relatively rigid policy, and it represents a simple example of how to
realize a terminating protocol for conflict resolution. Alternative solutions can
be added to the prototype thanks to its modularity.

Once all conflicts have been addressed, the supervisor applies the enabled 
actions, and obtains the new global state. Each agent receives a communication 
containing the outcome of its action execution and the changes to its local 
state. Moreover, further information might be sent to the participating agents, 
depending on the outcome of the coordination procedure. For instance, when
two agents agree on an on_conflict option, they "promise" to execute specific
actions (e.g., the fact that one agent has to execute T consecutive nop).

3.4 Failure policies



Agents receive a failure message from the supervisor whenever their requested 
actions have been inhibited. In such a case, the original plan of the agent has to 
be revised to detect if the local goal can still be reached, possibly by replanning. 
Also in this case, different approaches can be applied. For instance, an agent
could avoid developing an entire plan at each step, and limit itself to producing
a partial plan for the very next step. Alternatively, an agent could attempt to
determine the "minimal" modifications to the existing plan in order to make it
valid with respect to the newly encountered state.¹

In this replanning phase, the agent can exploit the on_failure options
associated with the corresponding inhibited action. The intuitive semantics of these
options can be described as follows.

• retry_after T [if C]: the agent first evaluates the constraint C; if C
holds, then it executes the action nop T times and then tries again the
failed action (provided that its executability conditions still hold).

• replan [if C_1] [add_goal C_2]: the agent first evaluates C_1; if it holds,
then in the following replanning phase the goal C_2 is added to the current
local goal. The option add_goal C_2 is optional; if it is not present then
nothing is added to the goal, i.e., it is the same as add_goal true.

• fail [if C_1]: this is analogous to replan [if C_1] add_goal false.
In this case the agent declares that it is impossible to reach its goal. It
quits and does not participate in the subsequent steps of the concurrent
plan execution.

• If none of the above options is applicable, then the agent will proceed as
if the option replan if true is present.

All the options declared for the inhibited action are considered in the given 
order, executing the first applicable one. 
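
The selection of the first applicable on_failure option can be sketched as follows. The Python encoding of the options as dictionaries, the function name, and the returned labels are assumptions made for the sake of illustration; only the selection rule (declaration order, missing if treated as true, default to replanning) comes from the description above.

```python
from typing import Callable, Dict, List, Optional, Tuple

def react_to_failure(
    options: List[Dict[str, object]],
    holds: Callable[[object], bool],
) -> Tuple[str, Optional[object]]:
    """Pick the first applicable on_failure option, in declaration order.

    Options are dictionaries such as
        {"op": "retry_after", "T": 2, "if": cond}
        {"op": "replan", "if": cond, "add_goal": goal}
        {"op": "fail", "if": cond}
    A missing "if" counts as true; a missing "add_goal" adds nothing.
    """
    for option in options:
        if "if" in option and not holds(option["if"]):
            continue  # condition does not hold: try the next option
        if option["op"] == "retry_after":
            return ("wait_then_retry", option["T"])
        if option["op"] == "replan":
            return ("replan", option.get("add_goal"))
        if option["op"] == "fail":
            return ("quit", None)
    return ("replan", None)  # default: behave as 'replan if true'
```

For example, with the options `fail if hopeless` followed by `retry_after 2`, the agent waits and retries whenever the condition `hopeless` does not hold, and quits otherwise.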

It might be the case that some global constraints (such as holds_at and
always, cf. Sect. 2) involve fluents that are not known by any of the agents.
Therefore, none of the agents can consider such constraints while planning. 
Consequently, these constraints have to be enforced while merging the individual 
plans. In doing this, the supervisor adopts the same strategies introduced to 
deal with conflicts and failures among actions, as described earlier. Namely, 
whenever a global constraint would be violated by the concurrent execution of 
actions (taken from different agents' plans) a conflict is generated and a conflict 
resolution procedure executed. Thus, some of the conflicting actions will be 
inhibited causing their failure. 



3.5 Broadcasting and direct requests 

Let us describe a simple protocol for implementing the point-to-point and broadcast
communications among agents, following an explicit request of the form (10).



¹ At this time, the prototype includes only replanning from scratch at each step.
In particular, let us assume that the current state is the $i$-th one of the plan
execution; hence, the supervisor is coordinating the transition to the $(i+1)$-th
state by executing the $(i+1)$-th action of each local plan. The handling of
requests is interleaved with the agent-supervisor interactions that realize plan 
execution; nevertheless, the supervisor does not intervene on the requests, and 
the requests and offers are directly exchanged among agents. We can sketch the 
main steps involved in a state transition, from the point of view of an agent a, 
as follows: 

(1) Agent a tries to execute its action and sends this information to the
supervisor (Sect. 3.2).



(2) Possibly after a coordination phase, a receives from the supervisor the out- 
come of its attempt to execute the action (failure or success, the changes 
in the state, etc.) 

(3) If the action execution is successful, before declaring the current transi- 
tion completed, the agent a starts an interaction with the other agents 
to handle pending requests. All the communications associated to such 
interactions are realized using Linda's tuple-space (requests and offers are 
posted and retrieved by agents). 

(3.a) Agent a fetches the collection H of all the requests still pending and
generated until step i. For each request of help $h \in H$, originating
from some agent b, agent a decides whether to accept h or not. Such
a decision might involve planning activities, in order to determine if 
the requested condition can be achieved by a, possibly by modifying 
its original plan. In the positive case, a posts its offer into the tuple- 
space and waits for a rendez-vous with b. 

(3.b) Agent a checks whether there are replies to the requests it previously 
posted. For each request for which replies are available, a collects the 
set of offers/agents that expressed their willingness to help a. By us- 
ing some strategy, a selects one of the responding agents, say b. The 
policy for choosing the responding agent can be programmed (e.g., 
by exploiting priorities, agent's knowledge on other agents, random 
selection, trust criteria, utility and optimality considerations). Once 
the choice has been made, a establishes a rendez-vous with the se- 
lected agent and 

• declares its availability to b, 

• communicates the fulfillment of the request to the other agents. 
The request and the obsolete offers are removed from the tuple space. 

(4) At that point in time, the transition can be considered completed for the 
agent a. By taking into account the information about the outcome of the 
coordination phase in solving conflicts (point (2)), the agreement reached 
in handling requests (point (3)), a might need to modify its plan. If the 
replanning phase succeeds, then a will proceed with the execution of the 
next action in its local plan. 

Note that we provided separate descriptions for steps (3.a) and (3.b). In a
concrete implementation, these two steps have to be executed in an interleaved
manner, to prevent a fixed order in sending requests and offers from causing
deadlocks or starvation. Furthermore, if an agent fails in executing an action, then
it will skip the step (3) and proceed with step (4) in order to re-plan its activity.
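
Step (3.b) can be illustrated by the following Python sketch. It is not the protocol code of the prototype: the shared tuple space is replaced by a plain list, the tuple shapes `("request", id, requester, constraint)` and `("offer", id, helper)` are hypothetical, and the selection policy shown (highest priority) is only one of the programmable policies listed above.

```python
from typing import Dict, List, Optional, Tuple

def select_helper(
    space: List[Tuple], request_id: int, priority: Dict[str, int]
) -> Optional[str]:
    """Choose one responding agent for a request and clean up the space.

    Returns None, leaving the space untouched, if no offer has been
    posted yet (the request stays pending).
    """
    offers = [t for t in space if t[0] == "offer" and t[1] == request_id]
    if not offers:
        return None
    chosen = max(offers, key=lambda t: priority[t[2]])[2]
    # The request and all its offers (now obsolete) are removed in place.
    space[:] = [t for t in space if t[1] != request_id]
    return chosen
```

In a concurrent setting the read and the removal would have to be performed atomically, which is what the blackboard's in operation provides.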

Figure 2: The dependencies between modules in the system. The modules'
names recall the corresponding Prolog file names. The module runner is the
starter of the application. The module settings specifies user options (policies,
strategies, etc.) and the source files containing the action descriptions; it is
imported by all the others (we omitted drawing the corresponding arcs, as well
as the nodes relative to less relevant SICStus libraries).



3.6 The languages $\mathcal{B}^{MAP}$ and $\mathcal{B}^{FD}$

The language $\mathcal{B}^{AAC}$ and its implementation rely heavily on their foundations,
$\mathcal{B}^{MAP}$ and $\mathcal{B}^{FD}$. In this section we briefly compare these three languages to
clarify which parts of the solvers of the previous languages can be used for the
implementation of $\mathcal{B}^{AAC}$ presented in Subsection 3.7.

Let us focus first on $\mathcal{B}^{FD}$. This is a single-agent framework. Therefore,
considering a given action theory, all fluents and actions are known to the single
agent, and the language does not allow one to specify private fluents or actions.
Moreover, $\mathcal{B}^{FD}$ allows one to specify static causal laws. The syntax of fluent
expressions and constraints is exactly the same as in $\mathcal{B}^{AAC}$. The syntax for
executability and action effects is analogous to that of $\mathcal{B}^{AAC}$. More precisely, in
$\mathcal{B}^{FD}$, these laws take the forms:

• executable(x, C)

• causes(x, C_1, C_2), where C_1 is the constraint that will hold in the next
state if the action x is executed in a state where C_2 holds.

These are just syntactic variants of the executability law and of the dynamic
causal law (6) of $\mathcal{B}^{AAC}$, respectively. The semantics
of $\mathcal{B}^{FD}$ is given via a transition system analogous to that introduced for $\mathcal{B}^{AAC}$.
In particular, one might note that if a $\mathcal{B}^{AAC}$ action description involves a single
agent that knows all the fluents (and no communication laws are included), then
its semantics coincides with the one of the corresponding $\mathcal{B}^{FD}$ program obtained
by an immediate syntactic translation. The Prolog interpreter for $\mathcal{B}^{FD}$ is proved
to be correct and complete (for soundness the absence of static laws is needed,
but this is the case for $\mathcal{B}^{AAC}$, as presented here) with respect to the semantics
in [Hi- 
Let us now consider $\mathcal{B}^{MAP}$. It is a multi-agent, centralized language, where
collective actions, namely actions that require more than one agent for being
executed, are allowed. For instance, a law of the form

action x executable by $a_1, a_2, \ldots, a_n$

specifies that agents $a_1, a_2, \ldots, a_n$ may execute together the action x. In $\mathcal{B}^{AAC}$,
instead, in the domain of an agent a, an action definition implicitly states that
the action is executed by a (hence, this is a particular case of the $\mathcal{B}^{MAP}$ law). On
the other hand, since the reasoner is centralized, conflicts among effects never
occur and all (concomitant) planned actions are always successfully executed.
The declaration of fluents in $\mathcal{B}^{MAP}$ is analogous to that in $\mathcal{B}^{AAC}$, whereas $\mathcal{B}^{MAP}$
has a different syntax for dynamic laws, since they can refer directly to action
occurrences. A $\mathcal{B}^{MAP}$ dynamic law has the form Prec causes Eff, where Prec
and Eff are constraints and at least one reference to an action x must explicitly
occur in Prec. Such references are specified by exploiting action flags of the
form actocc(x).

The semantics of $\mathcal{B}^{MAP}$ is given via the same notion of transition system used
for $\mathcal{B}^{FD}$ and for $\mathcal{B}^{AAC}$. If a multi-agent action description in $\mathcal{B}^{AAC}$, together with
initial state and goal, is such that no conflict occurs during the plan, then the
$\mathcal{B}^{MAP}$ action description obtained by a simple (mostly one-to-one) translation
has exactly the same behaviour on the transition system. Let us observe that
in this translation, collective actions are not generated.

3.7 Implementation issues 

A first prototype of the system has been implemented in SICStus Prolog, using
the library clpfd for the agents' reasoning (by exploiting the interpreters for
Action Description Languages described in [inilll]), and the libraries system, 
linda/server, and linda/client for handling process communication. 

The system is structured in modules. Figure 2 displays the modules composing
the Prolog prototype and their dependencies. The modules spaceServer
(via lindaServer) and lindaClient implement the interfaces with the Linda
tuple space. These modules support all the communications among agents.

Each autonomous agent corresponds to an instance of the module plan_executor,
which, in turn, relies on a planner (the module sicsplan/bmap in Figure 2)
for planning/replanning activities, and on client for interacting with other
agents in the system. As explained previously, a large part of the coordination
is guided by the module supervisor. Notice that both the supervisor and
client act as Linda clients. Conflict resolution functionalities are provided to
the modules client and supervisor by the modules ConflictSolver_client
and ConflictSolver_super, respectively. Finally, the arbitration_opt module
implements the arbitration protocol(s). In the current code distribution, we
provide an arbitration strategy that maximizes the number of actions performed
at each step.

Let us remark that all the policies exploited in coordination, arbitration,
and conflict handling can be customized by simply providing a different
implementation of individual predicates exported by the corresponding modules. For
instance, to implement a conflict resolution strategy different from the round-robin
one described earlier, it suffices to add to the system a new implementation
of the module ConflictSolver_super (and of ConflictSolver_client, if the
specific strategy requires an active role of the conflicting agents). Similar
extensions can be made for arbitration_opt.

The system execution is rooted in the server process runner (written either
for Linux (.sh) or for Windows (.bat) platforms), which is in charge of generating the
connection address that must be used by the client processes.

The file settings.pl describes the planning problem to be solved. In particular,
the user must specify in this file, through Prolog facts, the number and
the names of the files containing the action descriptions, a bound on the maximum
length of the plan, and the selected strategies for conflict resolution and
arbitration (default choices can be used).

As far as the reasoning/planning module is concerned, we slightly modified
the interpreters of the $\mathcal{B}^{FD}$ and the $\mathcal{B}^{MAP}$ languages [ini ITT] to accept the
extended syntax presented here. However, the system is open to further extensions,
and different planners (not necessarily based on Prolog technology) can be
easily integrated thanks to the simple interface with the module plan_executor,
which consists of a few Prolog predicates.

Currently, two planners have been integrated in the system: sicsplan is
the constraint logic programming planner for the single-agent action language
$\mathcal{B}^{FD}$; bmap is instead a constraint logic programming engine that supports
centralized planning for multi-agent systems (capable, e.g., of collaborating in 
pursuing a common goal). Thus, the implementation allows each individual 
agent (according to the discussion from the previous sections) to be itself a 
complex system composed of multiple agents (operating in a cooperative fashion 
and planning in a centralized manner). 

To accommodate for this perspective, the design of the supervisor has been 
modified. The framework allows each concurrent planner that executes a multiple- 
action step, to specify the desired granularity of the conflict resolution phase. 
This is done by specifying (for each step in a plan) a partition of the set of ac- 
tions composing the step into those subsets of actions that have to be considered 
independently and as a whole. 

For instance, in the next section we describe a specification of a coordination 
problem between two multi-agent systems. Each multi-agent system develops 
a plan in a centralized manner. Each step of such plans consists of a set of,
possibly complex, actions (instead of a single action, as happens for the planner
sicsplan). The conflicts between the multi-agent plans occurring during the
$i$-th state transition are identified/resolved by considering a single action of
each $i$-th step proposed by each planner.

Let us make some considerations about the soundness of the implementation.
Let us consider one step $i + 1$ in the construction of the trajectory. The state
sequence already constructed is $\vec{v} = \langle v_0, \ldots, v_i \rangle$. The agents propose some
actions for execution; the overall set of all actions proposed by all agents is
$Y_{i+1} = \{y_1, \ldots, y_k\}$. Agents propose for execution actions that are executable
in $\vec{v}$. At the implementation level, the soundness property is guaranteed by the
correctness of the sicsplan/bmap module (see Section 3.6).

Let us denote with $C(y_j)$ the constraint that captures the effects of action
$y_j$; i.e., if the action $y_j$ has dynamic causal laws $y_j$ causes $E_r$ if $P_r$ for
$r = 1, \ldots, m$, then

$$C(y_j) = \bigwedge_{r=1}^{m} (P_r \rightarrow E_r).$$

Let $A(y_j)$ be a Boolean variable, intuitively denoting whether the supervisor
has selected action $y_j$ for execution at time $i + 1$.

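The module arbitration_opt implements an arbitration protocol $\Phi(\vec{v}, Y_{i+1})$
producing a substitution for $\{A(y_1), \ldots, A(y_k)\}$ such that the constraint

$$\bigwedge_{j=1}^{k} \big( \Phi(\vec{v}, Y_{i+1})(A(y_j)) \rightarrow C(y_j) \big)$$

has a $\vec{v}$-solution $\sigma$.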

For example, in the current code distribution, the protocol $\Phi$ is defined as a
substitution that maximizes $\sum_{j=1}^{k} A(y_j)$.

From these definitions and from the properties of sicsplan/bmap, we have
that $\langle \vec{v}, \{y_j \mid j \in \{1, \ldots, k\}, \Phi(\vec{v}, Y_{i+1})(A(y_j)) = 1\}, \mathrm{ine}(\sigma, \vec{v}) \rangle$ is a valid state
transition.

If the conflict resolution is left to the agents, then the protocol $\Phi$ is the
outcome of the conflict resolution procedure, e.g., the round-robin analysis of
the conflicting actions described in Section 3.3, which is currently implemented.
It is immediate to check that the round-robin procedure produces a protocol $\Phi$
that satisfies the properties shown above.

Due to the generality of the language for agent-based on_conflict resolution,
the correctness of any conflict resolution procedure must be independently
proved. Correctness is not an immediate consequence of the language itself,
but depends on the specific on_conflict declarations used in the specific
procedure.

3.8 The volleyball domain 

Let us describe a specification in $\mathcal{B}^{AAC}$ of a coordination problem between two
multi-agent systems, an extension of the domains described in Examples 1–4.
There are two teams, black and white, whose objective is to score a point, i.e.,
to throw the ball into the field of the other team (passing over the net) in such
a way that no player of the other team can reach the ball before it touches the
ground. Each team is modeled as a multi-agent system that elaborates its own
plan in a centralized manner (thus, each step in the plan consists of a set of
actions).

Figure 3: A representation of an execution of the volleyball domain



The playing field is discretized by fixing a linex × liney rectangular grid
that determines the positions where the players (and the ball) can move (see
Fig. 3). The leftmost (rightmost) cells are those of the black (white) team, while
the net (x = 6) separates the two subfields. There are p players per team (p = 2
in Fig. 3); concretely, the fact num(2) is added to the theory. The allowable
actions are: move(d), throw(d, f), and whistle. During the defense time, the
players can move to catch the ball and/or to re-position themselves on the court. 
When a player reaches the ball (s)he will have the ball and will throw the ball 
again. A team scores a point either if it throws the ball to a cell in the opposite 
subfield that is not reached by any player of the other team in the defense time, 
or if the opposite team throws the ball in the net. The captain (first player) of 
each team is in charge of checking if a point has been scored. In this case, (s)he 
whistles. 

Each team (either black or white) is modeled as a centralized multi-agent
system, which acts as a single agent in the interaction with the other team.
Alternative modeling options are also possible; for instance, one could model
each single player as an independent agent that develops its own plan and interacts
with all other players. The two teams have the goal of scoring a point:
goal(point(black) eq 1). for blacks and goal(point(white) eq 1). for
whites.

At the beginning of the execution every team has a winning strategy, devel- 
oped as a local plan; these are possibly revised after each play to accommodate 
for the new state of the world reached. An execution (as printed by the system)
is reported in Fig. 3 for a plan length of 9. The symbol (respectively, Y) denotes
the white (respectively, black) players, Q (resp. X) denotes a white (resp. black) player
with the ball. The throw moves applied are:

[player(black,1)]:throw(ne,3) (time 1)
[player(black,2)]:throw(se,3) (time 3)
[player(white,1)]:throw(w,5) (time 5)
[player(black,1)]:throw(e,5) (time 7)

Let us observe that, although it would be in principle possible for the white team
to reach the ball and throw it within the time allowed, it would be impossible
to score a point. Therefore, the players prefer not to perform any move.

The complete description of the encoding of this domain is available at
http://www.dimi.uniud.it/dovier/BAAC. The repository also includes additional
domains, e.g., a domain inspired by games involving one ball and two goals, as
found in soccer. Although the encoding might seem similar to that of volleyball,
the possibility of contact between two players makes this encoding more
complex. Indeed, thanks to the fact that the net separates the two teams, in
the volleyball domain rules like the following one suffice to avoid collisions:

always(pair(x(A),y(A)) neq pair(x(B),y(B))) :-
    A=player(black,N), B=player(black,M), num(N), num(M), N<M.

In a soccer world this is not true because only the supervisor can be aware, in 
advance, of possible contacts between different team players originating from 
concurrent actions. This generates interesting concurrency problems, e.g., con- 
cerning the ball possession after a contact. A simple way to address this problem 
consists in assigning a fluent to each field cell, whose value can be —1 (free), 
(resp., 1) if a white (resp., black) player is in the cell. The supervisor identifies
a conflict when two opponent players move to the same cell, thus assigning
different values to that fluent. In this case, the supervisor arbitrarily enables one
action, while the other agent waits a turn to retry the action:

action act([A],move(D)) on_failure retry_after 1 on_conflict arbitrate :-
    agent(A), direction(D).
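
The behavior requested by this declaration can be sketched, under simplifying assumptions, as follows. The Python fragment is not the supervisor of the implementation: the function arbitrate, the dictionary layout, and the use of a random choice to model "arbitrarily enables one action" are all assumptions made for illustration. The sketch also treats any two agents targeting the same cell as conflicting, without distinguishing teams.

```python
import random

def arbitrate(requests, rng=random):
    """Resolve moves that target the same cell in the same step.

    `requests` maps an agent name to the cell it wants to enter (hypothetical
    layout). Returns (enabled, retry): `enabled` maps each agent whose move is
    executed now to its cell; `retry` maps each blocked agent to the number
    of turns it waits before retrying (1, mirroring retry_after 1).
    """
    contenders = {}
    for agent, cell in requests.items():
        contenders.setdefault(cell, []).append(agent)

    enabled, retry = {}, {}
    for cell, agents in contenders.items():
        winner = rng.choice(agents)      # arbitrary choice by the supervisor
        enabled[winner] = cell
        for agent in agents:
            if agent != winner:
                retry[agent] = 1         # wait one turn, then retry
    return enabled, retry

# Two opponents move to cell (3, 4); a third agent moves elsewhere.
moves = {"white1": (3, 4), "black2": (3, 4), "black1": (0, 0)}
enabled, retry = arbitrate(moves, random.Random(0))
```

Exactly one of the two contenders appears in enabled and the other in retry; which one wins depends on the random source, as expected of an arbitrary choice.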

4 Conclusions and future work 

In this paper, we illustrated the design of a high-level action description language 
for the description of multi-agent domains. The language enables the descrip- 
tion of agents with individual goals operating in a shared environment. The 
agents can explicitly interact (by requesting help from other agents in achieving 
their own goals) and implicitly cooperate in resolving conflicts that may arise 
during execution of their individual plans. The main features of the framework
described in this paper have been realized in an implementation based
on SICStus Prolog. The implementation is fully distributed, and uses Linda
to enable communication among agents. This prototype is currently being
refined and extended with further features.

Many agent programming languages have been proposed, such as the BDI agent
programming language AgentSpeak [19] (as implemented in Jason [3]), JADE [2] (and its
extension Jadex 0]), ConGolog [9], IMPACT [23], 3APL 0, GOAL [H]. A good
comparison of many of these languages can be found in [IT. The emphasis of
the effort presented in this paper is to expand our original work on
constraint-based modeling of agents based on action languages. The generalization to a
constraint-based multi-agent action language has been presented in piT . In this
paper we demonstrate a further extension to encompass distributed reasoning 
and distributed planning. Thus, the focus of the proposal remains on the level 
of creating an action language and demonstrating the suitability of constraint- 
based technology to support it. As such, we do not propose here a new agent
programming language; rather, we advance an action-language perspective and
examine how action languages scale to multi-agent domains. Our work could be used as the
underlying formalism for the development of new agent programming languages.
In this sense, our proposal differs from many of the MAS development
platforms, which focus on programming languages for MAS and on complex
protocols for advertising and interaction among agents (e.g., FIPA).

The choice of Linda came about for simplicity; we required the use of a
CLP platform, and SICStus provides support for both Linda and constraint
handling, as do a few other distributed communication platforms (e.g., OAA [6]).
In the long term, we envision mapping our agent design onto a MAS infrastructure
that enables discovery and addition of agents, handles network-wide
distribution of agents, and maps the exchange of constraints to a standard agent
communication language (e.g., FIPA-ACL/FIPA-SL [H]). This will require
non-trivial engineering work to map the reasoning with action languages (e.g.,
planning) to a platform that is not constraint-based; we are currently exploring
the problem in the context of Jason [3].

The work is an initial proposal that already shows strong potential and opens
several avenues of research. The immediate goal in the improvement of the system
consists of adding refined strategies and coordination mechanisms involving, for
instance, payoff and trust. Then, we intend to evaluate the performance and
quality of the system in several multi-agent domains (e.g., game playing
scenarios, modeling of auctions, and other domains requiring distributed planning).
We also plan to investigate strategies to enhance performance by exploiting fea- 
tures provided by the constraint solving libraries of SICStus (e.g., the use of the 
table constraint [1]). 

We will investigate the use of future references in the fluent constraints (as 
fully supported in B^'^^) — we believe this feature may provide a more elegant 
approach to handle the requests among agents, and it is necessary to enable 
the expression of complex interactions among agents (e.g., to model forms of 
negotiation with temporal references). In particular, we view this platform as 
ideal to experiment with models of negotiation (e.g., as discussed in [3T]) and
to deal with commitments [16] (which often require temporal references).

We will also explore the implementation of different strategies associated with
conflict resolution; in particular, we are interested in investigating how to
capture the notion of "trust" among agents, as a dynamic property that changes
depending on how reliable agents have been in providing services to other agents
(e.g., agreeing to provide a property but failing to make it happen). Also
concerning trust evaluation, different approaches can be integrated in the
system. For instance, a "controlling entity" (e.g., either the supervisor or a
privileged/elected agent) could be in charge of assigning the "degree of trust" of
each agent. Alternatively, each single agent could develop its own opinion on
other agents' reliability, depending on the behavior they manifested in past
interactions.
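
As a purely illustrative sketch of the second option, the Python fragment below keeps one trust degree per (observer, provider) pair and revises it after each interaction. Nothing here is part of the current system: the exponential-smoothing update, the initial value 0.5, the rate 0.2, and the names update_trust and TrustLedger are assumptions chosen only to show how trust could behave as a dynamic property.

```python
def update_trust(trust, fulfilled, rate=0.2):
    """Move a trust degree in [0, 1] toward 1 after a kept commitment and
    toward 0 after a broken one (exponential smoothing, an assumed rule)."""
    target = 1.0 if fulfilled else 0.0
    return trust + rate * (target - trust)

class TrustLedger:
    """Opinions indexed by (observer, provider); all names are hypothetical."""

    def __init__(self, initial=0.5):
        self.initial = initial
        self.opinions = {}

    def record(self, observer, provider, fulfilled):
        """Revise the observer's opinion of the provider after one interaction."""
        key = (observer, provider)
        current = self.opinions.get(key, self.initial)
        self.opinions[key] = update_trust(current, fulfilled)
        return self.opinions[key]

    def trust(self, observer, provider):
        """Current opinion, or the initial value if the agents never interacted."""
        return self.opinions.get((observer, provider), self.initial)

ledger = TrustLedger()
ledger.record("a1", "a2", fulfilled=True)    # a2 delivered what it accepted
ledger.record("a1", "a2", fulfilled=False)   # a2 accepted but failed
```

The first option (a controlling entity) corresponds to using a single observer, e.g., the supervisor, for all entries of the ledger.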

Finally, work is needed to expand the framework to enable greater flexibility 
in several aspects, such as: 

• Allow deadlines for requests, e.g., by allowing axioms of the form

request C1 if C2 until T
indicating that the request is valid only if accomplished within T time
steps.

• Allow constraint-based delays for requests:

request C1 if C2 while C3
indicating that the request is still valid while constraint C3 is entailed.

• Allow dynamic changes in the agents' knowledge about other agents (e.g., 
an action might make an agent aware of the existence of other agents), 
or about the world (e.g., an action might change the rights another agent 
has to access/modify some fluents). 
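
The intended reading of the first two extensions can be sketched in Python as follows. This is only one possible interpretation, not a specification: the class Request, its field names, and the use of a Boolean test on a state as a stand-in for constraint entailment are assumptions. The triggering condition C2 is omitted, so the sketch covers only how long an already issued request for C1 remains valid.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Request:
    wanted: Callable[[dict], bool]                        # C1, the requested constraint
    issued_at: int                                        # time step of the request
    until: Optional[int] = None                           # T, in "until T"
    while_cond: Optional[Callable[[dict], bool]] = None   # C3, in "while C3"

    def is_valid(self, state, now):
        """A request lapses after T steps or as soon as C3 no longer holds."""
        if self.until is not None and now - self.issued_at > self.until:
            return False
        if self.while_cond is not None and not self.while_cond(state):
            return False
        return True

    def is_pending(self, state, now):
        """Still valid and not yet accomplished."""
        return self.is_valid(state, now) and not self.wanted(state)

# "request C1 if C2 until 3", issued at time 2.
timed = Request(wanted=lambda s: s["gate"] == "open", issued_at=2, until=3)

# "request C1 if C2 while C3", with C3 read as fuel > 0.
guarded = Request(wanted=lambda s: s["gate"] == "open", issued_at=0,
                  while_cond=lambda s: s["fuel"] > 0)
```

With these definitions, timed is valid up to and including time step 5, and guarded stays valid for as long as the fuel test holds, regardless of elapsed time.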